Levels of certainty in knowledge-intensive corpora: an initial annotation study
نویسندگان
چکیده
In this initial annotation study, we suggest an appropriate approach for determining the level of certainty in text, including classification into multiple levels of certainty, types of statement and indicators of amplified certainty. A primary evaluation, based on pairwise inter-annotator agreement (IAA) using F1-score, is performed on a small corpus comprising documents from the World Bank. While IAA results are low, the analysis will allow further refinement of the created guidelines.
منابع مشابه
Factuality Levels of Diagnoses in Swedish Clinical Text
Different levels of knowledge certainty, or factuality levels, are expressed in clinical health record documentation. This information is currently not fully exploited, as the subtleties expressed in natural language cannot easily be machine analyzed. Extracting relevant information from knowledge-intensive resources such as electronic health records can be used for improving health care in gen...
متن کاملMeta-Knowledge Annotation of Bio-Events
Biomedical corpora annotated with event-level information provide an important resource for the training of domain-specific information extraction (IE) systems. These corpora concentrate primarily on creating classified, structured representations of important facts and findings contained within the text. However, bio-event annotations often do not take into account additional information (meta...
متن کاملAn annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies
A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...
متن کاملComparative Study of the Academic Vocabulary Content of Electronic Engi-neering Corpora, GE Materials and M.S. Entrance Examinations
The importance of vocabulary learning has been underlined in the field of English for Academic Purposes (EAP) because non-English majors who require reading English texts in their fields of study have to expand their English vocabulary knowledge much more efficiently than ordinary ESL/EFL learners. Since academic vocabulary instruction in Iranian universities is realized through the use of Gene...
متن کاملDialogue Act Annotation with the ISO 24617-2 Standard
This chapter describes recent and ongoing annotation efforts using the ISO 24617-2 standard for dialogue act annotation. Experimental studies are reported on the annotation by human annotators and by annotation machines of some of the specific features of the ISO annotation scheme, such as its multidimensional annotation of communicative functions, the recognition of each of its nine dimensions...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2010